Crime has many causes, and understanding them is the basis for designing preventive measures and policing strategies. This project studies crime data for Colchester in 2024 and 2025 and investigates the correlation between crime rates and weather conditions. Using street-level crime reports and daily weather data from a local weather station, it examines patterns of criminal behavior and how environmental factors such as temperature, precipitation, and wind might affect them.
To identify such patterns and show how the crime landscape in Colchester changes under particular weather conditions, this report combines statistical graphics, interactive visualizations, and geographic plots.
Both the crime and weather datasets were cleaned and transformed before analysis. The crime data were cleaned to make them consistent and relevant. Columns with unclear names were renamed to more readable, standardized ones: category became crime_type, lat became latitude, and long became longitude. Dates were supplied in YYYY-MM format and were converted to complete dates by appending -01 to every record, so that they could be parsed as dates and plotted as a time series. Crime types were converted to title case for consistency, and only the relevant variables (date, crime_type, latitude, longitude, street_name, and outcome_status) were kept. To support accurate geospatial analysis, all rows with a missing latitude or longitude were removed.
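The cleaning steps above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not the code used for the report; the sample records are invented, and replacing hyphens before title-casing is an assumption about how the raw category values look.

```python
from datetime import date

# Hypothetical raw records; the values are invented for illustration only.
raw_rows = [
    {"category": "violent-crime", "date": "2024-07", "lat": "51.889", "long": "0.901",
     "street_name": "On or near High Street", "outcome_status": "Under investigation"},
    {"category": "shoplifting", "date": "2024-08", "lat": "", "long": "0.897",
     "street_name": "On or near Supermarket", "outcome_status": ""},
]

# Renames described in the text.
RENAMES = {"category": "crime_type", "lat": "latitude", "long": "longitude"}
KEEP = ["date", "crime_type", "latitude", "longitude", "street_name", "outcome_status"]


def clean_crime_rows(rows):
    """Rename columns, complete the dates, title-case types, drop rows without coordinates."""
    cleaned = []
    for row in rows:
        row = {RENAMES.get(key, key): value for key, value in row.items()}
        # Rows without a latitude or longitude cannot be mapped, so skip them.
        if not row["latitude"] or not row["longitude"]:
            continue
        # "2024-07" -> "2024-07-01" -> date(2024, 7, 1)
        row["date"] = date.fromisoformat(row["date"] + "-01")
        # Assumption: raw categories are hyphenated slugs such as "violent-crime".
        row["crime_type"] = row["crime_type"].replace("-", " ").title()
        row["latitude"] = float(row["latitude"])
        row["longitude"] = float(row["longitude"])
        cleaned.append({key: row[key] for key in KEEP})
    return cleaned


crimes = clean_crime_rows(raw_rows)
```

Note that the day component added this way is a placeholder: every record in a month receives the first day of that month.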
Similarly, the weather datasets for 2023–2024 and 2024–2025 underwent structured cleaning using a custom function. This function shortened long variable names (e.g., TemperatureCAvg to tavg, Precmm to precipitation), converted string-formatted dates to Date objects, and kept only the most climatologically relevant variables (e.g., temperature, humidity, wind speed, rainfall, sunshine, cloud cover). Applying the same cleaning process to both years gave the two datasets the same format and structure. Daily crime counts were then aggregated and merged with the corresponding weather data by date, producing a combined dataset that links weather conditions to daily crime frequency for the subsequent analysis.
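As a rough sketch of the weather cleaning and the date-based merge, the Python below uses only the standard library. The function names, the "Date" column name, the ISO date format, and the sample values are all assumptions made for illustration; only the two renames come from the text.

```python
from collections import Counter
from datetime import date

# Renames taken from the text; a real dataset would have more columns.
WEATHER_RENAMES = {"TemperatureCAvg": "tavg", "Precmm": "precipitation"}


def clean_weather_rows(rows):
    """Shorten variable names and parse the date string (assumed ISO format)."""
    cleaned = []
    for row in rows:
        out = {"date": date.fromisoformat(row["Date"])}  # "Date" is an assumed column name
        for old_name, new_name in WEATHER_RENAMES.items():
            out[new_name] = float(row[old_name])
        cleaned.append(out)
    return cleaned


def merge_crime_weather(crime_dates, weather_rows):
    """Count crimes per date, then keep only dates present in both datasets."""
    counts = Counter(crime_dates)
    weather_by_date = {w["date"]: w for w in weather_rows}
    merged = []
    for day in sorted(counts):
        weather = weather_by_date.get(day)
        if weather is None:
            continue  # no weather record for this date (inner join)
        merged.append({**weather, "crime_count": counts[day]})
    return merged


# Invented sample data.
weather_raw = [
    {"Date": "2024-07-01", "TemperatureCAvg": "17.5", "Precmm": "0.0"},
    {"Date": "2024-08-01", "TemperatureCAvg": "18.1", "Precmm": "2.4"},
]
crime_dates = [date(2024, 7, 1), date(2024, 7, 1), date(2024, 8, 1), date(2024, 9, 1)]

merged = merge_crime_weather(crime_dates, clean_weather_rows(weather_raw))
```

An inner join like this silently drops dates that lack a weather record, so it is worth comparing the row counts before and after the merge.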
Figure 1: Frequency of Crime Types
The table and bar plot of reported crime types in Colchester (2024–25) in Figure 1 show that Violent Crime is by far the most common category, with more than twice as many reports as the next category. Anti-Social Behaviour and Shoplifting are also prevalent, which points to widespread public disturbance and property crime. The mid-range categories, i.e., Criminal Damage & Arson, Public Order, and Other Theft, indicate a steady presence of disorder and opportunistic crime. Robbery, Theft from the Person, and Possession of Weapons are far less common and appear only rarely. Overall, the distribution suggests that the local crime landscape is dominated by violent and anti-social crime, which makes these categories important focus areas for targeted policing and community safety activities.
Figure 2: Crime Outcomes
The pie chart in Figure 2 shows the distribution of outcomes for reported crimes in Colchester during 2024 and 2025. The largest segment is “Investigation complete; no suspect identified,” so the most common result is an investigation that closed without a resolution. A large share of cases is also labeled “Under investigation,” which suggests a backlog or enquiries that are still in progress. A notable proportion is classified as Unknown, which points to problems with data reporting or follow-up monitoring. Other outcomes, such as “Awaiting court outcome” or “Formal action is not in the public interest,” occur less frequently but still reflect the diversity in how cases are processed. As a whole, the chart shows that most reported crimes either end without a suspect being identified or remain unresolved, which highlights possible issues with case closure and follow-through.
Figure 3: Violin Plot of Daily Crime Spread
Figure 3 uses a violin plot to show the distribution and spread of the number of crimes per day in Colchester over the study period. The widest part of the plot, at about 480 to 520 crimes, marks the most common daily counts. The shape is fairly symmetrical, implying that the daily counts are balanced, with no strong skew toward high or low values. The tapered tails show that few days are extreme (more than 600 or fewer than 430 crimes). Overall, daily crime counts were concentrated in a consistent range, with only a few days well above or below average. This supports the idea that crime in Colchester followed a stable daily pattern throughout the year.
Figure 4: Time Series of Monthly Crime Trend
The line plot in Figure 4 illustrates the monthly trend in reported crimes in Colchester from April 2024 to early 2025. The red line shows the actual number of crimes in each month, whereas the black curve is a LOESS smoothing line, which captures the broader trend by removing short-term variation. Crime is higher in summer, peaking in July 2024, and then declines slowly through the autumn and winter months to its lowest point in January 2025. This seasonal effect implies that crime increases in summer and decreases in winter, most likely because of changes in outdoor activity, social behavior, and the occurrence of public events.
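To make the idea of LOESS smoothing concrete, the sketch below fits a weighted straight line around each point using tricube weights. It is a simplified local-linear variant with no robustness iterations, written with the Python standard library only; the smoother settings behind Figure 4 are not stated in this report and may differ. The function name, the `frac` parameter, and the monthly counts are illustrative assumptions, not the report's data.

```python
def loess_smooth(xs, ys, frac=0.75):
    """Local-linear smoother: one weighted least-squares line per point."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours in each local window
    fitted = []
    for x0 in xs:
        # Bandwidth = distance to the k-th nearest point (fall back to 1.0 if zero).
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1.0
        # Tricube weights: close points count most, points at or beyond h count zero.
        weights = [max(0.0, 1 - (abs(x - x0) / h) ** 3) ** 3 for x in xs]
        total = sum(weights)
        mean_x = sum(w * x for w, x in zip(weights, xs)) / total
        mean_y = sum(w * y for w, y in zip(weights, ys)) / total
        sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
        slope = sxy / sxx if sxx else 0.0
        fitted.append(mean_y + slope * (x0 - mean_x))
    return fitted


# Invented monthly counts (month index 0 = April), for illustration only.
months = list(range(10))
counts = [500, 520, 540, 600, 560, 530, 510, 490, 470, 430]
smoothed = loess_smooth(months, counts)
```

A smaller `frac` follows the monthly values more closely, while a larger one gives a flatter curve.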
Figure 5: Map of Crime Locations
Figure 5 is an interactive leaflet map that shows the spatial distribution of reported crimes in Colchester in 2024–2025. Each red dot marks the place where a crime occurred, plotted at its latitude and longitude. Markers cluster in the central parts of Colchester, especially near the town centre, as well as in Lexden and Hythe, which indicates that these are hotspots of criminal activity, probably because of higher population density, shopping areas, and public accessibility. Conversely, peripheral neighborhoods show far fewer reported crimes. The map is a useful geographic representation: interactive zooming and pop-ups with the crime type and street name can help identify areas where targeted law enforcement or community safety measures are needed.
Figure 6: Word Cloud of Crime Street Names
The word cloud in Figure 6 shows the most common words in the street names where crimes were reported in Colchester in 2024–2025. Larger words occur more frequently in the data. The most prominent word is "near", which shows that many crime locations were recorded relative to a landmark (e.g., near a supermarket or near a station). Other common words are road, street, area, supermarket, shopping, and station, which indicates that crimes were usually committed in public spaces or on business premises. The presence of terms such as nightclub, parking, church, and centre also shows that crime hotspots gather in the town centre and other crowded areas with heavy foot traffic or social activity. This visualization offers a qualitative view of crime locations, reinforcing what was already observed in the leaflet map and adding detail about the kinds of public spaces that may need closer attention or crime prevention.
Figure 7: Average Temperature Comparison (2023–2025)
Figure 7 compares average temperatures for two periods, 2023–2024 and 2024–2025, from July (summer) to January (winter), showing a distinct seasonal peak in July and a trough in January, as expected in the Northern Hemisphere. The 2024–2025 line runs consistently higher than the 2023–2024 line, so each month in the later period was warmer on average than the same month a year earlier: for example, July 2024 was warmer than July 2023, and January 2025 was warmer than January 2024. In short, 2024–2025 was generally warmer than 2023–2024 over the same months, in both summer and winter. If the trend persists, it could have significant implications for climate analysis.
Figure 8: Average Temperature of both years
The density plot in Figure 8 of average daily temperatures across both periods (2023–2024 and 2024–2025) shows that most days in Colchester were moderate, with temperatures most frequently between 10°C and 15°C, where the curve is highest. The density rises noticeably from about 5°C, indicating a fair number of colder days, while above 18°C it declines gradually, indicating that warmer days occurred less often. The curve is slightly left-skewed, with more of the density concentrated in mild and warm values than in cold ones. The plot therefore gives a good overview of overall temperature conditions across the two years.
Figure 9: Interactive Correlation of Weather Variables
Figure 9 shows an interactive correlation matrix of the relationships between the key weather variables in Colchester between 2023 and 2025. As expected, the temperature variables (tmin, tavg, and tmax) are strongly positively correlated with one another (darkest blue), confirming that minimum, average, and maximum temperatures tend to rise and fall together.
Sunshine is positively related to visibility and negatively related to the cloud variables (cloud_total, cloud_low) and humidity, indicating clearer and drier skies on sunny days. Similarly, precipitation is moderately negatively correlated with sunshine and visibility, implying that rainy days are cloudier and have poorer visibility.
The wind-related variables (wind_speed, wind_gust) are closely related to each other but have weaker relationships with the other features. The hierarchical clustering (the tree branches on each axis) groups similar variables together, which confirms these relationships graphically, for example in the close clustering of sunshine, temperature, and visibility.
Overall, the matrix gives a clear picture of the internal relationships among the weather variables over the two years and confirms that the weather in Colchester behaves as meteorologically expected.
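The values in a correlation matrix like the one described above are typically pairwise Pearson coefficients. The sketch below, in standard-library Python, shows the calculation on a handful of invented daily values; the function names and the sample numbers are assumptions for illustration, and the correlation method behind Figure 9 is not stated in this report.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def correlation_matrix(columns):
    """Nested dict holding the pairwise correlation of every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Invented values for five days; variable names follow the cleaned weather data.
weather = {
    "tmin": [2, 5, 9, 12, 7],
    "tmax": [8, 11, 15, 18, 13],
    "sunshine": [1, 3, 8, 10, 4],
    "cloud_total": [8, 7, 2, 1, 6],
}
matrix = correlation_matrix(weather)
```

Values near 1 or -1 indicate a strong linear relationship, while values near 0 indicate little linear association.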
Figure 10: Monthly average of Weather Variables (2023-2025)
The multi-line plot in Figure 10 shows average monthly trends in Colchester across 2023 to 2025 for four key weather measurements: average temperature (tavg), humidity, precipitation, and sunshine. Sunshine and average temperature both peak in the summer months and decline in winter, following the normal seasonal pattern of the Northern Hemisphere. Humidity stays fairly high throughout, dropping slightly in summer and early winter. Precipitation is more variable and sporadic, with uneven peaks that indicate separate wet spells rather than a pronounced rainy season.
Figure 11: Scatter Plot of Temperature Vs. Daily Crime Count
The interactive scatter plot in Figure 11 shows the relationship between average daily temperature and the number of reported crimes in Colchester. Each dot represents one day, so the plot shows how crime frequency changes with temperature. The upward-sloping linear regression line and its shaded confidence interval indicate a positive correlation: as average temperature increases, the number of crimes also tends to increase. Although the data show some scatter, the tendency appears fairly stable, with warm days associated with higher crime counts. This result is consistent with existing criminology research, which finds that higher temperatures bring more social activity and conflict and may therefore increase the number of reported cases (Field 1992). However, the relationship is not perfectly linear, and other variables that are not represented in this plot may also affect it.
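The regression line in a plot of this kind is an ordinary least-squares fit. A minimal sketch in standard-library Python follows; the function name and the sample temperature and count pairs are invented for illustration and are not the report's data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented daily values: average temperature (°C) and crime count.
tavg = [2.0, 6.5, 9.0, 12.5, 16.0, 19.5]
crime_count = [11, 12, 15, 14, 17, 19]

slope, intercept = fit_line(tavg, crime_count)
# A positive slope means the fitted crime count rises as temperature rises.
```

The slope estimates the change in daily crime count per additional degree Celsius. It describes an association only and does not account for other factors such as day of the week or public events.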
| temp_range | total_crimes |
|---|---|
| Cold (<5°C) | 912 |
| Mild (5–15°C) | 3546 |
| Warm (>15°C) | 1052 |
The table above groups crime counts by temperature range. Most crimes (3,546 incidents) occurred in mild weather (5–15°C), which suggests that moderate conditions encourage outdoor activity and social interaction and so increase opportunities for crime, although this band also covers the most days, so part of the total reflects how common mild days are. The Cold (<5°C) and Warm (>15°C) bands account for 912 and 1,052 incidents respectively, and no more extreme temperature categories appear, probably because of the local climate. Some observations may also lack a temperature value, because of either incomplete weather data or mismatched dates when the datasets were joined. The pattern supports the suggestion that comfortable weather is associated with higher levels of crime, but it should be interpreted with care because of the possible missing values.
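The banding behind such a table can be sketched as follows, in standard-library Python. The band boundaries follow the table labels (5°C and 15°C both fall in the Mild band), while the function name, the separate Missing band, and the sample rows are assumptions for illustration. Keeping missing temperatures in their own band makes any merge gaps visible instead of folding them into another category.

```python
from collections import defaultdict

COLD = "Cold (<5°C)"
MILD = "Mild (5–15°C)"
WARM = "Warm (>15°C)"
MISSING = "Missing"


def temp_range(tavg):
    """Assign an average temperature to a band; None means no weather record."""
    if tavg is None:
        return MISSING
    if tavg < 5:
        return COLD
    if tavg <= 15:
        return MILD
    return WARM


def crimes_by_temp_range(rows):
    """Sum crime counts within each temperature band."""
    totals = defaultdict(int)
    for row in rows:
        totals[temp_range(row["tavg"])] += row["crime_count"]
    return dict(totals)


# Invented merged rows; None marks a date with no matching weather record.
rows = [
    {"tavg": 3.2, "crime_count": 12},
    {"tavg": 9.8, "crime_count": 15},
    {"tavg": 14.0, "crime_count": 18},
    {"tavg": 18.6, "crime_count": 20},
    {"tavg": None, "crime_count": 7},
]
totals = crimes_by_temp_range(rows)
```

Dividing each total by the number of days in its band would give a crimes-per-day figure that is easier to compare across bands of different sizes.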
In conclusion, the analysis suggests that crime in Colchester rises with warmer and sunnier weather, with violent and anti-social crime the most prevalent categories. The weather data conformed to the expected regional pattern, and the positive relationship between temperature and crime indicates that climate may affect crime.